Overview

Dataset statistics

Number of variables: 10
Number of observations: 456727
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 7609
Duplicate rows (%): 1.7%
Total size in memory: 26.1 MiB
Average record size in memory: 60.0 B

Variable types

Numeric: 6
Categorical: 2
DateTime: 1
Text: 1

Alerts

Dataset has 7609 (1.7%) duplicate rows [Duplicates]
CustID is highly overall correlated with ZipCode_Frequency [High correlation]
Month is highly overall correlated with Year [High correlation]
Year is highly overall correlated with Month [High correlation]
ZipCode_Frequency is highly overall correlated with CustID [High correlation]
Year is highly imbalanced (65.0%) [Imbalance]
playscount is highly skewed (γ1 = 29.65574724) [Skewed]
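
The 65.0% imbalance figure for Year is consistent with an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The report does not state the formula, so the sketch below is an assumption that happens to reproduce the reported value from the Year counts listed in this report; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log2(k): 0 for a uniform split, 1 when one class holds every row."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Year value counts as reported in this document (2013, 2012, 2011).
year_counts = [400760, 54721, 1246]
print(f"{imbalance_score(year_counts):.3f}")
```

A score near 0 would mean the three years are evenly represented; here almost 88% of rows fall in 2013, which pushes the score up.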

Reproduction

Analysis started: 2024-01-26 00:36:42.984741
Analysis finished: 2024-01-26 00:40:22.578039
Duration: 3 minutes and 39.59 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json
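
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the configuration actually used (that lives in config.json): the CSV path, output path, and report title are placeholders, and `build_report` is a hypothetical helper. It assumes `pandas` and `ydata-profiling` are installed.

```python
def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are placeholders)."""
    # Imported here so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Parsing SignDate as a date lets the profiler treat it as a date-time variable.
    df = pd.read_csv(csv_path, parse_dates=["SignDate"])
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(out_path)
    return out_path
```

Calling `build_report("plays.csv")` with a hypothetical input file would write `report.html` next to the script.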

Variables

CustID
Real number (ℝ)

HIGH CORRELATION 

Distinct: 5000
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2027.6046
Minimum: 0
Maximum: 4999
Zeros: 397
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 86
Q1: 687
Median: 1801
Q3: 3251
95-th percentile: 4628
Maximum: 4999
Range: 4999
Interquartile range (IQR): 2564

Descriptive statistics

Standard deviation: 1479.8317
Coefficient of variation (CV): 0.72984234
Kurtosis: -1.1125144
Mean: 2027.6046
Median Absolute Deviation (MAD): 1235
Skewness: 0.3613793
Sum: 9.2606177 × 10^8
Variance: 2189901.8
Monotonicity: Increasing
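
Several of the CustID statistics are derived from others, so they can be cross-checked with simple arithmetic. The sketch below copies the reported values and recomputes the interquartile range, range, coefficient of variation, variance, and sum; small differences are expected because the report rounds to about eight significant digits.

```python
# Values copied from the CustID statistics in this report.
n_rows = 456727
mean, std = 2027.6046, 1479.8317
q1, q3 = 687, 3251
minimum, maximum = 0, 4999

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n_rows            # sum of all values

print(iqr, value_range, round(cv, 4), round(variance, 1), f"{total:.4e}")
```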
Histogram with fixed size bins (bins=50)

Value                  Count  Frequency (%)
0                        397  0.1%
1                        389  0.1%
2                        366  0.1%
3                        355  0.1%
4                        353  0.1%
5                        350  0.1%
6                        346  0.1%
9                        341  0.1%
7                        338  0.1%
8                        327  0.1%
Other values (4990)   453165  99.2%
ValueCountFrequency (%)
0 397
0.1%
1 389
0.1%
2 366
0.1%
3 355
0.1%
4 353
0.1%
5 350
0.1%
6 346
0.1%
7 338
0.1%
8 327
0.1%
9 341
0.1%
ValueCountFrequency (%)
4999 61
< 0.1%
4998 65
< 0.1%
4997 47
< 0.1%
4996 62
< 0.1%
4995 67
< 0.1%
4994 67
< 0.1%
4993 57
< 0.1%
4992 67
< 0.1%
4991 63
< 0.1%
4990 53
< 0.1%

Gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB
0: 273694
1: 183033

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 456727
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 273694
59.9%
1 183033
40.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 273694
59.9%
1 183033
40.1%

Most occurring characters

ValueCountFrequency (%)
0 273694
59.9%
1 183033
40.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 456727
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 273694
59.9%
1 183033
40.1%

Most occurring scripts

ValueCountFrequency (%)
Common 456727
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 273694
59.9%
1 183033
40.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 456727
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 273694
59.9%
1 183033
40.1%

zip
Real number (ℝ)

Distinct: 4695
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50276.704
Minimum: 1002
Maximum: 99347
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 1002
5-th percentile: 6787
Q1: 27502
Median: 49925
Q3: 73526
95-th percentile: 95006
Maximum: 99347
Range: 98345
Interquartile range (IQR): 46024

Descriptive statistics

Standard deviation: 27547.412
Coefficient of variation (CV): 0.54791602
Kurtosis: -1.1272469
Mean: 50276.704
Median Absolute Deviation (MAD): 23124
Skewness: 0.030376972
Sum: 2.2962728 × 10^10
Variance: 7.5885989 × 10^8
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                  Count  Frequency (%)
72132                    475  0.1%
4042                     452  0.1%
17307                    389  0.1%
66216                    366  0.1%
5341                     364  0.1%
57445                    361  0.1%
50858                    360  0.1%
48915                    359  0.1%
36690                    355  0.1%
61377                    353  0.1%
Other values (4685)   452893  99.2%
ValueCountFrequency (%)
1002 154
< 0.1%
1037 54
 
< 0.1%
1066 88
 
< 0.1%
1082 98
< 0.1%
1092 89
 
< 0.1%
1115 68
 
< 0.1%
1195 60
 
< 0.1%
1235 102
< 0.1%
1253 230
0.1%
1256 117
< 0.1%
ValueCountFrequency (%)
99347 82
 
< 0.1%
99346 81
 
< 0.1%
99330 53
 
< 0.1%
99256 241
0.1%
99223 66
 
< 0.1%
99206 61
 
< 0.1%
99176 89
 
< 0.1%
99173 67
 
< 0.1%
99158 120
< 0.1%
99153 198
< 0.1%

SignDate
DateTime

Distinct: 455
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB
Minimum: 2011-05-10 00:00:00
Maximum: 2013-07-31 00:00:00
Histogram with fixed size bins (bins=50)
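
In the sample rows of this report, the Year, Month, and Day values agree with SignDate (for example, 2013-06-04 appears with 2013, 6, and 4), which suggests those three columns were derived from it. The sketch below shows one way such a split could be done; it is an assumption about the preprocessing, and `split_date` is a hypothetical helper.

```python
from datetime import date

def split_date(iso_date):
    """Split an ISO date string into (year, month, day) integers."""
    d = date.fromisoformat(iso_date)
    return d.year, d.month, d.day

# Dates taken from the first and last sample rows of this report.
print(split_date("2013-06-04"))
print(split_date("2012-11-10"))
```

Because the data end on 2013-07-31, rows from 2013 can only carry months 1 to 7, which is one plausible reason Month and Year are flagged as correlated.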

Artist
Text

Distinct: 357
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB

Length

Max length: 37
Median length: 29
Mean length: 11.188966
Min length: 3

Characters and Unicode

Total characters: 5110303
Distinct characters: 42
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: blue and gold
2nd row: girls
3rd row: i love you
4th row: one scotch
5th row: train
ValueCountFrequency (%)
the 33812
 
3.9%
band 17168
 
2.0%
bob 13654
 
1.6%
billy 13200
 
1.5%
12502
 
1.4%
john 12433
 
1.4%
beatles 9914
 
1.1%
brothers 9528
 
1.1%
alice 9461
 
1.1%
david 8200
 
0.9%
Other values (507) 729887
83.9%

Most occurring characters

ValueCountFrequency (%)
e 588332
 
11.5%
413032
 
8.1%
a 384554
 
7.5%
r 349085
 
6.8%
o 346097
 
6.8%
n 308519
 
6.0%
l 291886
 
5.7%
i 263635
 
5.2%
t 260320
 
5.1%
s 259977
 
5.1%
Other values (32) 1644866
32.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4625772
90.5%
Space Separator 413032
 
8.1%
Other Punctuation 41335
 
0.8%
Decimal Number 23899
 
0.5%
Dash Punctuation 6265
 
0.1%
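
The category names in this table correspond to Unicode general categories: Lowercase Letter is `Ll`, Space Separator is `Zs`, Other Punctuation is `Po`, Decimal Number is `Nd`, and Dash Punctuation is `Pd`. A tally of this kind can be sketched with the standard library as follows; `category_counts` is a hypothetical helper, and the three strings are artist names from the sample rows of this report.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in the given strings."""
    counts = Counter()
    for text in values:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts

sample = ["ac/dc", ".38 special", "3 doors down"]
print(category_counts(sample))
```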

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 588332
12.7%
a 384554
 
8.3%
r 349085
 
7.5%
o 346097
 
7.5%
n 308519
 
6.7%
l 291886
 
6.3%
i 263635
 
5.7%
t 260320
 
5.6%
s 259977
 
5.6%
c 213366
 
4.6%
Other values (18) 1360001
29.4%
Decimal Number
ValueCountFrequency (%)
3 8767
36.7%
8 5788
24.2%
1 3849
16.1%
0 2996
 
12.5%
2 1422
 
6.0%
5 569
 
2.4%
6 508
 
2.1%
Other Punctuation
ValueCountFrequency (%)
. 15278
37.0%
& 11811
28.6%
/ 6110
 
14.8%
' 4265
 
10.3%
? 3871
 
9.4%
Space Separator
ValueCountFrequency (%)
413032
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6265
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4625772
90.5%
Common 484531
 
9.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 588332
12.7%
a 384554
 
8.3%
r 349085
 
7.5%
o 346097
 
7.5%
n 308519
 
6.7%
l 291886
 
6.3%
i 263635
 
5.7%
t 260320
 
5.6%
s 259977
 
5.6%
c 213366
 
4.6%
Other values (18) 1360001
29.4%
Common
ValueCountFrequency (%)
413032
85.2%
. 15278
 
3.2%
& 11811
 
2.4%
3 8767
 
1.8%
- 6265
 
1.3%
/ 6110
 
1.3%
8 5788
 
1.2%
' 4265
 
0.9%
? 3871
 
0.8%
1 3849
 
0.8%
Other values (4) 5495
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5108550
> 99.9%
None 1753
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 588332
 
11.5%
413032
 
8.1%
a 384554
 
7.5%
r 349085
 
6.8%
o 346097
 
6.8%
n 308519
 
6.0%
l 291886
 
5.7%
i 263635
 
5.2%
t 260320
 
5.1%
s 259977
 
5.1%
Other values (30) 1643113
32.2%
None
ValueCountFrequency (%)
ö 1384
79.0%
ÿ 369
 
21.0%

playscount
Real number (ℝ)

SKEWED 

Distinct: 126
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.8832344
Minimum: 1
Maximum: 449
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 449
Range: 448
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.156845
Coefficient of variation (CV): 1.676289
Kurtosis: 2358.9595
Mean: 1.8832344
Median Absolute Deviation (MAD): 0
Skewness: 29.655747
Sum: 860124
Variance: 9.9656706
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                  Count  Frequency (%)
1                     307175  67.3%
2                      80652  17.7%
3                      29610  6.5%
4                      13269  2.9%
5                       7089  1.6%
6                       4330  0.9%
7                       3074  0.7%
8                       2244  0.5%
9                       1702  0.4%
10                      1340  0.3%
Other values (116)      6242  1.4%
ValueCountFrequency (%)
1 307175
67.3%
2 80652
 
17.7%
3 29610
 
6.5%
4 13269
 
2.9%
5 7089
 
1.6%
6 4330
 
0.9%
7 3074
 
0.7%
8 2244
 
0.5%
9 1702
 
0.4%
10 1340
 
0.3%
ValueCountFrequency (%)
449 1
< 0.1%
385 1
< 0.1%
321 1
< 0.1%
294 1
< 0.1%
242 1
< 0.1%
239 1
< 0.1%
234 1
< 0.1%
199 1
< 0.1%
190 1
< 0.1%
185 1
< 0.1%

Year
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.5 MiB
2013: 400760
2012: 54721
2011: 1246

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1826908
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2013
2nd row: 2013
3rd row: 2013
4th row: 2013
5th row: 2013

Common Values

ValueCountFrequency (%)
2013 400760
87.7%
2012 54721
 
12.0%
2011 1246
 
0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2013 400760
87.7%
2012 54721
 
12.0%
2011 1246
 
0.3%

Most occurring characters

ValueCountFrequency (%)
2 511448
28.0%
1 457973
25.1%
0 456727
25.0%
3 400760
21.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1826908
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 511448
28.0%
1 457973
25.1%
0 456727
25.0%
3 400760
21.9%

Most occurring scripts

ValueCountFrequency (%)
Common 1826908
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 511448
28.0%
1 457973
25.1%
0 456727
25.0%
3 400760
21.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1826908
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 511448
28.0%
1 457973
25.1%
0 456727
25.0%
3 400760
21.9%

Month
Real number (ℝ)

HIGH CORRELATION 

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.7306531
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
Median: 6
Q3: 7
95-th percentile: 11
Maximum: 12
Range: 11
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.3793526
Coefficient of variation (CV): 0.41519746
Kurtosis: 0.69417649
Mean: 5.7306531
Median Absolute Deviation (MAD): 1
Skewness: 0.40153434
Sum: 2617344
Variance: 5.6613188
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=12)

Value                  Count  Frequency (%)
7                     126553  27.7%
6                      91074  19.9%
5                      67254  14.7%
4                      48973  10.7%
3                      36089  7.9%
2                      21035  4.6%
1                      20452  4.5%
12                     16186  3.5%
11                     10964  2.4%
10                      8366  1.8%
Other values (2)        9781  2.1%
ValueCountFrequency (%)
1 20452
 
4.5%
2 21035
 
4.6%
3 36089
 
7.9%
4 48973
 
10.7%
5 67254
14.7%
6 91074
19.9%
7 126553
27.7%
8 4447
 
1.0%
9 5334
 
1.2%
10 8366
 
1.8%
ValueCountFrequency (%)
12 16186
 
3.5%
11 10964
 
2.4%
10 8366
 
1.8%
9 5334
 
1.2%
8 4447
 
1.0%
7 126553
27.7%
6 91074
19.9%
5 67254
14.7%
4 48973
 
10.7%
3 36089
 
7.9%

Day
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.31602
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 9
Median: 17
Q3: 24
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.8165692
Coefficient of variation (CV): 0.54036273
Kurtosis: -1.1997137
Mean: 16.31602
Median Absolute Deviation (MAD): 8
Skewness: -0.078498956
Sum: 7451967
Variance: 77.731892
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=31)

Value                  Count  Frequency (%)
23                     18066  4.0%
27                     18038  3.9%
26                     17609  3.9%
19                     16514  3.6%
13                     16088  3.5%
20                     15770  3.5%
30                     15616  3.4%
18                     15448  3.4%
16                     15410  3.4%
10                     15393  3.4%
Other values (21)     292775  64.1%
ValueCountFrequency (%)
1 12276
2.7%
2 14789
3.2%
3 14528
3.2%
4 14841
3.2%
5 12517
2.7%
6 13837
3.0%
7 14187
3.1%
8 13195
2.9%
9 14733
3.2%
10 15393
3.4%
ValueCountFrequency (%)
31 9560
2.1%
30 15616
3.4%
29 14916
3.3%
28 15354
3.4%
27 18038
3.9%
26 17609
3.9%
25 15240
3.3%
24 15257
3.3%
23 18066
4.0%
22 15287
3.3%

ZipCode_Frequency
Real number (ℝ)

HIGH CORRELATION 

Distinct: 255
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 119.2205
Minimum: 45
Maximum: 475
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 45
5-th percentile: 60
Q1: 74
Median: 96
Q3: 145
95-th percentile: 259
Maximum: 475
Range: 430
Interquartile range (IQR): 71

Descriptive statistics

Standard deviation: 63.68349
Coefficient of variation (CV): 0.5341656
Kurtosis: 3.4478556
Mean: 119.2205
Median Absolute Deviation (MAD): 27
Skewness: 1.7399225
Sum: 54451221
Variance: 4055.5869
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value                  Count  Frequency (%)
70                      8470  1.9%
68                      8432  1.8%
71                      7810  1.7%
69                      7797  1.7%
75                      7425  1.6%
65                      7345  1.6%
73                      7227  1.6%
72                      6984  1.5%
66                      6864  1.5%
80                      6720  1.5%
Other values (245)    381653  83.6%
ValueCountFrequency (%)
45 45
 
< 0.1%
46 92
 
< 0.1%
47 188
 
< 0.1%
48 144
 
< 0.1%
49 294
 
0.1%
50 700
0.2%
51 561
 
0.1%
52 832
0.2%
53 1325
0.3%
54 1512
0.3%
ValueCountFrequency (%)
475 475
0.1%
452 452
0.1%
389 389
0.1%
366 366
0.1%
364 364
0.1%
361 361
0.1%
360 360
0.1%
359 359
0.1%
355 355
0.1%
353 353
0.1%
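
The largest ZipCode_Frequency value, 475, equals the row count reported for zip 72132, and the sample rows for that zip carry 475 in this column. That suggests the column holds, for each row, the number of rows sharing its zip (a frequency encoding). The sketch below illustrates that encoding under this assumption; `frequency_encode` is a hypothetical helper and the input list is made up.

```python
from collections import Counter

def frequency_encode(values):
    """Map each value to the number of times it occurs in the input."""
    counts = Counter(values)
    return [counts[v] for v in values]

# Made-up zip values for illustration only.
zips = [72132, 72132, 4042, 72132, 17307, 4042]
print(frequency_encode(zips))
```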

Interactions

[36 interaction plots; images not preserved]

Correlations

                    CustID     Day  Gender   Month    Year  ZipCode_Frequency  playscount     zip
CustID               1.000   0.024   0.036   0.002   0.045             -0.830      -0.205  -0.012
Day                  0.024   1.000   0.030   0.016   0.043             -0.022      -0.009  -0.030
Gender               0.036   0.030   1.000   0.001   0.015             -0.008      -0.005   0.017
Month                0.002   0.016   0.001   1.000   0.632              0.000      -0.000  -0.005
Year                 0.045   0.043   0.015   0.632   1.000             -0.017      -0.005   0.032
ZipCode_Frequency   -0.830  -0.022  -0.008   0.000  -0.017              1.000       0.175   0.025
playscount          -0.205  -0.009  -0.005  -0.000  -0.005              0.175       1.000   0.001
zip                 -0.012  -0.030   0.017  -0.005   0.032              0.025       0.001   1.000
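
The two pairs flagged as "High correlation" in the alerts are the only off-diagonal entries of this matrix whose absolute value is large. The sketch below picks such pairs out of a few entries copied from the matrix. The 0.5 cut-off is an assumption chosen for illustration, not a value stated in this report, and `high_correlation_pairs` is a hypothetical helper.

```python
# A few off-diagonal entries copied from the correlation matrix in this report.
correlations = {
    ("CustID", "ZipCode_Frequency"): -0.830,
    ("CustID", "playscount"): -0.205,
    ("Month", "Year"): 0.632,
    ("ZipCode_Frequency", "playscount"): 0.175,
    ("CustID", "Year"): 0.045,
}

def high_correlation_pairs(matrix, threshold=0.5):
    """Return pairs whose absolute correlation meets the threshold, strongest first."""
    flagged = [(a, b, r) for (a, b), r in matrix.items() if abs(r) >= threshold]
    return sorted(flagged, key=lambda item: -abs(item[2]))

for a, b, r in high_correlation_pairs(correlations):
    print(f"{a} ~ {b}: {r:+.3f}")
```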

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

        CustID  Gender    zip  SignDate    Artist                   playscount  Year  Month  Day  ZipCode_Frequency
     0       0       0  72132  2013-06-04  blue and gold                    11  2013      6    4                475
     1       0       0  72132  2013-06-04  girls                             2  2013      6    4                475
     2       0       0  72132  2013-06-04  i love you                        3  2013      6    4                475
     3       0       0  72132  2013-06-04  one scotch                        4  2013      6    4                475
     4       0       0  72132  2013-06-04  train                            17  2013      6    4                475
     5       0       0  72132  2013-06-04  .38 special                     242  2013      6    4                475
     6       0       0  72132  2013-06-04  10cc                             45  2013      6    4                475
     7       0       0  72132  2013-06-04  3 doors down                     70  2013      6    4                475
     8       0       0  72132  2013-06-04  ac/dc                           449  2013      6    4                475
     9       0       0  72132  2013-06-04  aerosmith                        12  2013      6    4                475

Last rows

        CustID  Gender    zip  SignDate    Artist                   playscount  Year  Month  Day  ZipCode_Frequency
456717    4999       1  33662  2012-11-10  night ranger                      1  2012     11   10                152
456718    4999       1  33662  2012-11-10  pat travers                       1  2012     11   10                152
456719    4999       1  33662  2012-11-10  paul mccartney                    2  2012     11   10                152
456720    4999       1  33662  2012-11-10  peter gabriel                     1  2012     11   10                152
456721    4999       1  33662  2012-11-10  pink floyd                        2  2012     11   10                152
456722    4999       1  33662  2012-11-10  police                            1  2012     11   10                152
456723    4999       1  33662  2012-11-10  queens of the stone age           1  2012     11   10                152
456724    4999       1  33662  2012-11-10  queensryche                       1  2012     11   10                152
456725    4999       1  33662  2012-11-10  the guess who                     1  2012     11   10                152
456726    4999       1  33662  2012-11-10  tom petty                         1  2012     11   10                152

Duplicate rows

Most frequently occurring

CustIDGenderzipSignDateArtistplayscountYearMonthDayZipCode_Frequency# duplicates
110173072013-07-27crosby620137273893
118340837172013-06-11crosby120136112513
183530403722013-05-02crosby12013522363
282851723592013-06-27paul mccartney120136272193
4061320677492013-06-18crosby120136183033
430138180732013-05-20crosby120135202653
7192480488702013-06-05crosby12013651893
12414741186512013-07-22paul mccartney120137221373
15536210637852013-05-30paul mccartney120135301263
16566660251132013-06-18crosby120136181343
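
A "# duplicates" column like the one above can be produced by counting identical rows. The sketch below does this with the standard library on made-up rows; `duplicate_summary` is a hypothetical helper. It returns both the number of distinct repeated rows and the number of extra copies, since this report does not say which convention its duplicate count of 7609 follows.

```python
from collections import Counter

def duplicate_summary(rows):
    """Return (extra_copies, groups) where groups maps each repeated row to its count."""
    counts = Counter(rows)
    groups = {row: n for row, n in counts.items() if n > 1}
    # Every occurrence beyond the first counts as an extra copy.
    extra_copies = sum(n - 1 for n in groups.values())
    return extra_copies, groups

# Made-up rows of (CustID, Artist, playscount) for illustration only.
rows = [
    (1, "crosby", 6),
    (1, "crosby", 6),
    (1, "crosby", 6),
    (2, "train", 1),
    (3, "girls", 2),
    (3, "girls", 2),
]
extra, groups = duplicate_summary(rows)
print(extra, len(groups), f"{extra / len(rows):.1%}")
```

Rows must be hashable (tuples here) for `Counter` to group them.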